Micro-expression recognition


Temporal and Spatial Feature Fusion Framework for Dynamic Micro Expression Recognition

Liu, Feng, Nan, Bingyu, Qian, Xuezhong, Fu, Xiaolan

arXiv.org Artificial Intelligence

When emotions are repressed, an individual's true feelings may be revealed through micro-expressions. Consequently, micro-expressions are regarded as a genuine source of insight into an individual's authentic emotions. However, their transient and highly localised nature poses a significant challenge to accurate recognition, with recognition accuracy as low as 50% even for professionals. To address these challenges, we explore dynamic micro-expression recognition (DMER) using multimodal fusion techniques, with particular attention to the diverse fusion of temporal and spatial modal features. In this paper, we propose a novel Temporal and Spatial feature Fusion framework for DMER (TSFmicro). The framework integrates a Retention Network (RetNet) with a transformer-based DMER network, aiming at efficient micro-expression recognition through the capture and fusion of temporal and spatial relations. From the perspective of modal fusion, we further propose a novel parallel time-space fusion method that fuses spatio-temporal information in a high-dimensional feature space, yielding complementary "where-how" relationships at the semantic level and providing richer semantic information for the model. Experimental results on three well-recognised micro-expression datasets demonstrate the superior performance of TSFmicro in comparison to other contemporary state-of-the-art methods.
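The abstract does not detail the TSFmicro architecture, so the following is only a minimal sketch of the general parallel fusion pattern it describes: a spatial stream and a temporal stream projected into a shared high-dimensional space and fused for classification. All module names, dimensions and the classifier head are assumptions for illustration.

```python
# Minimal, illustrative sketch of parallel temporal-spatial feature fusion.
# Dimensions, module names, and the fusion head are assumptions for clarity,
# not the TSFmicro implementation described in the paper.
import torch
import torch.nn as nn

class ParallelSpatioTemporalFusion(nn.Module):
    def __init__(self, spatial_dim=512, temporal_dim=512, fused_dim=512, num_classes=5):
        super().__init__()
        # Project both streams into a common high-dimensional space.
        self.spatial_proj = nn.Linear(spatial_dim, fused_dim)
        self.temporal_proj = nn.Linear(temporal_dim, fused_dim)
        # Fuse the "where" (spatial) and "how" (temporal) representations.
        self.fusion = nn.Sequential(
            nn.Linear(2 * fused_dim, fused_dim),
            nn.ReLU(),
            nn.Linear(fused_dim, num_classes),
        )

    def forward(self, spatial_feat, temporal_feat):
        s = self.spatial_proj(spatial_feat)     # (B, fused_dim)
        t = self.temporal_proj(temporal_feat)   # (B, fused_dim)
        return self.fusion(torch.cat([s, t], dim=-1))

# Example: a batch of 8 clips, each already encoded by separate spatial/temporal backbones.
logits = ParallelSpatioTemporalFusion()(torch.randn(8, 512), torch.randn(8, 512))
```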


A Benchmark for Incremental Micro-expression Recognition

Lai, Zhengqin, Hong, Xiaopeng, Wang, Yabin, Li, Xiaobai

arXiv.org Artificial Intelligence

Micro-expression recognition plays a pivotal role in understanding hidden emotions and has applications across various fields. Traditional recognition methods assume access to all training data at once, but real-world scenarios involve continuously evolving data streams. To meet the requirement of adapting to new data while retaining previously learned knowledge, we introduce the first benchmark specifically designed for incremental micro-expression recognition. Our contributions are fourfold: first, we formulate an incremental learning setting tailored to micro-expression recognition; second, we organize sequential datasets with carefully curated learning orders to reflect real-world scenarios; third, we define two cross-evaluation-based testing protocols, each targeting distinct evaluation objectives; and finally, we provide six baseline methods and their corresponding evaluation results. This benchmark lays the groundwork for advancing incremental micro-expression recognition research. All source code used in this study will be publicly available at https://github.com/ZhengQinLai/IMER-benchmark.
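As a rough illustration of the incremental setting and cross-evaluation idea (not the benchmark's actual protocols or baselines, which live in the linked repository), the sketch below fine-tunes a toy classifier on datasets arriving one at a time and re-evaluates it on every dataset seen so far; all data and model choices are placeholders.

```python
# Illustrative sketch of a sequential (incremental) fine-tuning baseline with
# cross-evaluation on all datasets seen so far. The tensors and the tiny
# classifier are stand-ins, not the benchmark's models or data.
import torch
import torch.nn as nn

torch.manual_seed(0)
# Three toy "datasets" arriving as a stream: (features, labels).
stream = [(torch.randn(64, 128), torch.randint(0, 3, (64,))) for _ in range(3)]

model = nn.Linear(128, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step, (x, y) in enumerate(stream):
    # Fine-tune on the newly arrived dataset only (naive baseline, prone to forgetting).
    for _ in range(20):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # Cross-evaluation: test on every dataset observed so far.
    for seen, (xs, ys) in enumerate(stream[: step + 1]):
        acc = (model(xs).argmax(1) == ys).float().mean().item()
        print(f"after dataset {step}: accuracy on dataset {seen} = {acc:.2f}")
```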


Facial Expression Analysis and Its Potentials in IoT Systems: A Contemporary Survey

Shanggua, Zixuan, Dong, Yanjie, Guo, Song, Leung, Victor C. M., Deen, M. Jamal, Hu, Xiping

arXiv.org Artificial Intelligence

Facial expressions convey human emotions and can be categorized into macro-expressions (MaEs) and micro-expressions (MiEs) based on duration and intensity. While MaEs are voluntary and easily recognized, MiEs are involuntary, rapid, and can reveal concealed emotions. The integration of facial expression analysis with Internet-of-Things (IoT) systems has significant potential across diverse scenarios. IoT-enhanced MaE analysis enables real-time monitoring of patient emotions, facilitating improved mental health care in smart healthcare. Similarly, IoT-based MiE detection enhances surveillance accuracy and threat detection in smart security. This work aims to provide a comprehensive overview of research progress in facial expression analysis and to explore its integration with IoT systems. We discuss the distinctions between our work and existing surveys, elaborate on advancements in MaE and MiE techniques across various learning paradigms, and examine their potential applications in IoT. We highlight challenges and future directions for the convergence of facial expression-based technologies and IoT systems, aiming to foster innovation in this domain. By presenting recent developments and practical applications, this study offers a systematic understanding of how facial expression analysis can enhance IoT systems in healthcare, security, and beyond.


Multimodal Latent Emotion Recognition from Micro-expression and Physiological Signals

Zhang, Liangfei, Qian, Yifei, Arandjelovic, Ognjen, Zhu, Anthony

arXiv.org Artificial Intelligence

The proposed approach is a novel multimodal learning framework that combines micro-expressions (MEs) and physiological signals (PS). It comprises a 1D separable and mixable depthwise inception network, a standardised-normal-distribution-weighted feature fusion method, and depth/physiology-guided attention modules for multimodal learning. Experimental results show that the proposed approach outperforms the benchmark method, with both the weighted fusion method and the guided attention modules contributing to the enhanced performance.
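The abstract does not specify the standardised-normal-distribution weighting, so the sketch below shows only a generic flavour of two-modality weighted fusion: z-score standardisation of each modality followed by a learnable softmax-weighted combination. Every name and dimension here is an assumption, not the paper's formulation.

```python
# Generic two-modality fusion sketch: z-score standardise each modality's
# features, then combine them with learnable softmax weights. This illustrates
# the flavour of weighted fusion only; it is not the paper's exact method.
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, me_dim=64, ps_dim=32, out_dim=64):
        super().__init__()
        self.me_proj = nn.Linear(me_dim, out_dim)   # micro-expression branch
        self.ps_proj = nn.Linear(ps_dim, out_dim)   # physiological-signal branch
        self.logits = nn.Parameter(torch.zeros(2))  # learnable modality weights

    @staticmethod
    def standardise(x, eps=1e-6):
        return (x - x.mean(dim=0)) / (x.std(dim=0) + eps)

    def forward(self, me_feat, ps_feat):
        w = torch.softmax(self.logits, dim=0)
        me = self.me_proj(self.standardise(me_feat))
        ps = self.ps_proj(self.standardise(ps_feat))
        return w[0] * me + w[1] * ps

fused = WeightedFusion()(torch.randn(16, 64), torch.randn(16, 32))
```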


Geometric Graph Representation with Learnable Graph Structure and Adaptive AU Constraint for Micro-Expression Recognition

Wei, Jinsheng, Peng, Wei, Lu, Guanming, Li, Yante, Yan, Jingjie, Zhao, Guoying

arXiv.org Artificial Intelligence

Micro-expression recognition (MER) is valuable because micro-expressions (MEs) can reveal genuine emotions. Most works take image sequences as input and cannot effectively exploit ME information because subtle ME-related motions are easily submerged in unrelated information. In contrast, facial landmarks are a low-dimensional and compact modality that incurs lower computational cost and potentially concentrates on ME-related movement features. However, the discriminability of facial landmarks for MER is unclear. Thus, this paper explores the contribution of facial landmarks and proposes a novel framework to recognize MEs efficiently. First, a geometric two-stream graph network is constructed to aggregate low-order and high-order geometric movement information from facial landmarks and obtain a discriminative ME representation. Second, a self-learning fashion is introduced to automatically model the dynamic relationships between nodes, even distant ones. Furthermore, an adaptive action unit loss is proposed to reasonably build the strong correlation between landmarks, facial action units and MEs. Notably, this work offers a novel and much more efficient route to MER, utilizing only graph-based geometric features. The experimental results demonstrate that the proposed method achieves competitive performance with a significantly reduced computational cost, and that facial landmarks contribute substantially to MER and merit further study for highly efficient ME analysis.
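For readers unfamiliar with landmark-based graph reasoning, here is a minimal, generic sketch of the underlying operation: build an adjacency matrix over facial landmarks and apply one normalised graph-convolution step. The distance threshold, feature dimensions and weights are illustrative; the paper's two-stream network and adaptive AU loss are not reproduced.

```python
# Minimal sketch of graph reasoning over facial landmarks: build a normalised
# adjacency from landmark distances and apply one graph-convolution step
# (A_hat X W). All numbers are illustrative, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
landmarks = rng.random((68, 2))          # 68 (x, y) facial landmarks
feats = rng.standard_normal((68, 16))    # per-landmark movement features

# Connect landmarks closer than a distance threshold (self-loops included).
dist = np.linalg.norm(landmarks[:, None] - landmarks[None, :], axis=-1)
adj = (dist < 0.2).astype(float)

# Symmetric normalisation: D^{-1/2} A D^{-1/2}.
deg = adj.sum(1)
d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
a_hat = d_inv_sqrt @ adj @ d_inv_sqrt

# One graph-convolution layer with a random weight matrix and ReLU.
weight = rng.standard_normal((16, 32))
hidden = np.maximum(a_hat @ feats @ weight, 0.0)
print(hidden.shape)  # (68, 32)
```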


Multi-scale multi-modal micro-expression recognition algorithm based on transformer

Wang, Fengping, Li, Jie, Qi, Chun, Wang, Lin, Wang, Pan

arXiv.org Artificial Intelligence

A micro-expression is a spontaneous, unconscious facial muscle movement that can reveal the true emotions people attempt to hide. Although hand-crafted methods have made good progress and deep learning is gaining prominence, existing algorithms cannot extract multi-modal, multi-scale facial region features while taking contextual information into account, owing to the short duration of micro-expressions and the different scales at which they are expressed across facial regions. To address these problems, this paper proposes a multi-modal, multi-scale algorithm based on a transformer network, which aims to fully learn local multi-grained micro-expression features through two modalities: motion features and texture features. To obtain local facial-area features at different scales, we learn patch features at different scales for both modalities, fuse multi-layer multi-headed attention weights to obtain effective features by weighting the patch features, and combine cross-modal contrastive learning for model optimisation. We conducted comprehensive experiments on three spontaneous datasets; the results show that the proposed algorithm reaches an accuracy of up to 78.73% in the single-database evaluation on SMIC and an F1 value of up to 0.9071 on CASME II in the composite-database evaluation, which is at the leading level.
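The cross-modal contrastive component can be illustrated with a standard InfoNCE-style objective between motion and texture embeddings of the same clips, as sketched below. This is a generic formulation under assumed embedding shapes; the paper's multi-scale patch fusion and attention weighting are not shown.

```python
# Sketch of a cross-modal contrastive (InfoNCE-style) objective between motion
# and texture embeddings of the same clips: matching pairs are pulled together,
# mismatched pairs pushed apart.
import torch
import torch.nn.functional as F

def cross_modal_contrastive(motion_emb, texture_emb, temperature=0.07):
    motion = F.normalize(motion_emb, dim=-1)
    texture = F.normalize(texture_emb, dim=-1)
    logits = motion @ texture.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(motion.size(0))             # i-th motion matches i-th texture
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = cross_modal_contrastive(torch.randn(8, 256), torch.randn(8, 256))
```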


Seeking Salient Facial Regions for Cross-Database Micro-Expression Recognition

Jiang, Xingxun, Zong, Yuan, Zheng, Wenming

arXiv.org Artificial Intelligence

This paper focuses on cross-database micro-expression recognition, in which the training and test micro-expression samples come from different micro-expression databases. Mismatched feature distributions between the training and test micro-expression samples degrade the performance of most well-performing micro-expression recognition methods. To deal with cross-database micro-expression recognition, we propose a novel domain adaptation method called Transfer Group Sparse Regression (TGSR). TGSR learns a sparse regression matrix that selects salient local facial regions and captures the correspondence between the training and test sets. We evaluate our TGSR model on the CASME II and SMIC databases. Experimental results show that the proposed TGSR achieves satisfactory performance and outperforms most state-of-the-art subspace-learning-based domain adaptation methods.
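The core group-sparsity mechanism behind selecting whole facial regions can be illustrated with a plain group-lasso regression solved by proximal gradient descent, as below. This omits TGSR's domain-adaptation terms and uses synthetic data; the grouping of rows into fixed-size regions is an assumption made for the example.

```python
# Sketch of the group-sparsity ingredient behind region selection: least
# squares with a group-lasso penalty, solved by proximal gradient descent.
# Rows of W are grouped by facial region; whole groups are zeroed together.
import numpy as np

rng = np.random.default_rng(0)
n, d, c, region = 200, 40, 5, 8          # samples, feature dim, classes, rows per region
X = rng.standard_normal((n, d))
Y = rng.standard_normal((n, c))

W = np.zeros((d, c))
lam = 0.5
step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient

for _ in range(300):
    grad = X.T @ (X @ W - Y)             # gradient of 0.5 * ||XW - Y||_F^2
    V = W - step * grad
    # Group soft-thresholding: shrink each region's block of rows jointly.
    for g in range(0, d, region):
        block = V[g:g + region]
        norm = np.linalg.norm(block)
        V[g:g + region] = max(0.0, 1 - step * lam / (norm + 1e-12)) * block
    W = V

kept = [g // region for g in range(0, d, region) if np.linalg.norm(W[g:g + region]) > 1e-8]
print("regions kept:", kept)
```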


Micro-Expression Recognition Based on Pixel Residual Sum and Cropped Gaussian Pyramid

#artificialintelligence

Facial micro-expression (ME) recognition has great significance for the progress of human society and can reveal a person's true feelings. At the same time, ME recognition faces a huge challenge, since MEs are difficult to detect and easily disturbed by the environment. In this paper, we propose two novel preprocessing methods based on the Pixel Residual Sum. These methods preprocess video clips according to the unit-pixel displacement between images, resist environmental interference and make subtle facial features easier to extract. Furthermore, we propose a Cropped Gaussian Pyramid with Overlapping (CGPO) module, which produces images of different resolutions through a Gaussian pyramid and crops each resolution into multiple overlapping sub-images. We then use a convolutional network with progressively increasing channels, based on depthwise convolution, to extract preliminary features. Finally, we fuse the preliminary features and apply position embedding to obtain the final features. Our experiments show that the proposed methods and model outperform well-known methods.
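The summary does not define the Pixel Residual Sum precisely; as a plausible minimal sketch, the code below sums absolute per-pixel differences between consecutive frames as a motion measure, which could be used to locate the most dynamic part of a clip. It is not the paper's preprocessing pipeline.

```python
# Illustrative sketch of a pixel-residual-sum style measure: for each pair of
# consecutive frames, sum the absolute per-pixel differences as a proxy for
# facial motion. The paper's preprocessing (and the CGPO module) is more involved.
import numpy as np

rng = np.random.default_rng(0)
clip = rng.random((30, 128, 128))                               # 30 grayscale frames

residual_sum = np.abs(np.diff(clip, axis=0)).sum(axis=(1, 2))   # shape (29,)
peak = int(residual_sum.argmax())
print(f"largest inter-frame motion between frames {peak} and {peak + 1}")
```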


Region attention and graph embedding network for occlusion objective class-based micro-expression recognition

Mao, Qirong, Zhou, Ling, Zheng, Wenming, Shao, Xiuyan, Huang, Xiaohua

arXiv.org Artificial Intelligence

Micro-expression recognition (MER) has attracted considerable research attention over the past decade. However, occlusion occurs in real-world MER scenarios. This paper investigates an interesting but largely unexplored challenge in MER, namely occluded MER. First, to study MER under real-world occlusion, synthetic occluded micro-expression databases are created for the community using various masks. Second, to suppress the influence of occlusion, a Region-inspired Relation Reasoning Network (RRRN) is proposed to model relations between facial regions. RRRN consists of a backbone network, a Region-Inspired (RI) module and a Relation Reasoning (RR) module. More specifically, the backbone network extracts feature representations from different facial regions; the RI module computes an adaptive weight for each region, via an attention mechanism, according to its unobstructedness and importance, thereby suppressing the influence of occlusion; and the RR module exploits progressive interactions among these regions by performing graph convolutions. Experiments are conducted on the hold-out-database evaluation and composite-database evaluation tasks of the MEGC 2018 protocol. Experimental results show that RRRN can effectively explore the importance of facial regions and capture their cooperative, complementary relationships for MER. The results also demonstrate that RRRN outperforms state-of-the-art approaches, especially under occlusion, and is more robust to occlusion.
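The RI module's idea of weighting regions by their informativeness can be illustrated with simple attention pooling over region features, as sketched below; the dimensions and scoring function are assumptions, and the RR module's graph reasoning is omitted.

```python
# Minimal sketch of region-level attention: each facial-region feature predicts
# its own scalar weight (e.g. reflecting how unobstructed/informative it is),
# and the clip representation is the weighted sum of region features.
import torch
import torch.nn as nn

class RegionAttention(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)   # one attention score per region

    def forward(self, region_feats):          # (B, num_regions, feat_dim)
        weights = torch.softmax(self.score(region_feats), dim=1)
        return (weights * region_feats).sum(dim=1)   # (B, feat_dim)

pooled = RegionAttention()(torch.randn(4, 9, 128))   # 9 facial regions per sample
```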


A comparative study on movement feature in different directions for micro-expression recognition

Wei, Jinsheng, Lu, Guanming, Yan, Jingjie

arXiv.org Artificial Intelligence

Micro-expressions can reflect people's real emotions. Recognizing them is difficult because they involve small motions and have a short duration. As research into micro-expression recognition deepens, many effective features and methods have been proposed. To determine which direction of movement feature is easiest to use for distinguishing micro-expressions, this paper selects 18 directions (covering three types of movement: horizontal, vertical and oblique) and proposes a new low-dimensional feature, the Histogram of Single Direction Gradient (HSDG), to study this question. The HSDG in each direction is concatenated with LBP-TOP to obtain LBP with Single Direction Gradient (LBP-SDG), which is used to analyse which direction of movement feature is more discriminative for micro-expression recognition. As in some existing work, Euler Video Magnification (EVM) is employed as a preprocessing step. Experiments on the CASME II and SMIC-HS databases identify the effective and optimal directions, and demonstrate that HSDG in an optimal direction is discriminative and that the corresponding LBP-SDG achieves state-of-the-art performance when combined with EVM.
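The exact HSDG definition is given in the paper; as an assumed illustration of the general idea, the sketch below projects image gradients onto a single chosen direction and histograms the projections. Bin count, value range and sign handling are guesses made for the example.

```python
# Sketch of a single-direction gradient histogram: compute image gradients,
# project them onto one chosen direction, and histogram the projections.
# Not the paper's exact HSDG (which is further concatenated with LBP-TOP).
import numpy as np

def single_direction_gradient_hist(frame, angle_deg, bins=8):
    gy, gx = np.gradient(frame.astype(float))            # vertical, horizontal gradients
    theta = np.deg2rad(angle_deg)
    direction = np.array([np.cos(theta), np.sin(theta)])
    projection = gx * direction[0] + gy * direction[1]   # signed magnitude along the direction
    hist, _ = np.histogram(projection, bins=bins, range=(-1.0, 1.0))
    return hist / max(hist.sum(), 1)                     # normalised histogram

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
print(single_direction_gradient_hist(frame, angle_deg=30))
```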